Abstract

The time dependence of the h-index is analyzed by considering the average
behaviour of $h$ as a function of the academic age $A_A$ for about 1400
Italian physicists, with career lengths spanning from 3 to 46 years. The
individual h-index is strongly correlated with the square root of the total
citations $N_c$: $h \approx 0.53\sqrt{N_c}$. For academic ages ranging from
12 to 24 years, the distribution of the time scaled index $h/\sqrt{A_A}$ is
approximately time-independent and is well described by the Gompertz
function. The time scaled index $h/\sqrt{A_A}$ has an average approximately
equal to 3.8 and a standard deviation approximately equal to 1.6. Finally,
the time scaled index $h/\sqrt{A_A}$ appears to be strongly correlated with
the contemporary h-index $h_c$.

Keywords: h-index, time dependence, time scaled h-index, contemporary
h-index



1. Introduction 



One of the purposes of modern bibliometrics is to introduce quantitative
indicators of the scientific production of individuals, with the aim of
establishing some rough classification or ranking. An indicator which has
been gaining much attention is the Hirsch index $h$ (Hirsch, 2005): given an
individual with $N$ publications, $h$ is defined as the number of papers
which received at least $h$ citations each, while the remaining $N - h$
papers received no more than $h$ citations each. Given that the h-index
increases monotonically with the age of the scientist involved, its time
dependence has been a relatively long-standing problem of bibliometrics,
with deep consequences on the possibility of comparing scientists showing
substantial differences in their academic age $A_A$, that is the length (in
years) of their academic career.

* Corresponding author
Email addresses: mannella@df.unipi.it (Riccardo Mannella), rossi@df.unipi.it
(Paolo Rossi)

Preprint submitted to Journal of Informetrics, July 17, 2012

In his original paper, Hirsch (2005) proposed that the h-index would grow
roughly linearly in time, and he therefore suggested the introduction of the
m-index, simply defined as the ratio between the h-index and the time lapse
$T_L$ (in years) between the first publication and the present date:
$m = h/T_L$.
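
The two definitions above can be made concrete with a minimal sketch. The
function names and the citation counts below are hypothetical and serve only
as an illustration; they are not taken from the data set analyzed in this
paper.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def m_index(citations, first_pub_year, current_year):
    """Hirsch's m: the h-index divided by the years since the first paper."""
    time_lapse = current_year - first_pub_year
    if time_lapse <= 0:
        raise ValueError("time lapse must be positive")
    return h_index(citations) / time_lapse


# Hypothetical citation counts for a single author (illustrative only).
example = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example))              # 3
print(m_index(example, 2000, 2012))  # 0.25
```

Here three papers have at least three citations each, but there are not four
papers with at least four, so $h = 3$; over a twelve-year time lapse this
gives $m = 0.25$.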



However, Guns and Rousseau (2009) showed by numerical simulations in a model
of the citation dynamics that the functional dependence of the growth may be
affected by a number of different deterministic and stochastic factors, and
linearity is not always assured. Absence of linearity was also observed by
Egghe (2009a,b, 2010) and Wu, Lozano and Helbing (2011). In view of these
results, it seems rather difficult to construct a robust indicator allowing
a precise ranking of scientists with different career lengths.

On the other hand, when the goal is to establish a benchmark of scientific
quality and productivity acting as a threshold for recruitment and
promotion, we are no longer bound to exploring the exact dependence of the
index on individual careers: rather, we may consider the statistical average
for sufficiently large groups as a proxy for an ideal temporal dependence of
scientific production and of its impact, and establish whether these
averages show some general behaviour.

We collected the bibliometric data of about 1400 Italian physicists
(randomly chosen among the approximately 2400 Physics academic staff
employed in Italian Universities at the end of 2010) using the SCOPUS
database, grouped according to the date of their first scientific
publication appearing on the database, from years 1965 to 2008. We then
computed the average of the total citations and of the h-index for each
annual group, and studied the correlations of these indicators between each
other and with time. Clearly, the $T_L$ for each group is given by the
difference between the time of data extraction and the year labeling each
annual class, and we identified the academic age $A_A$ with $T_L$.

We anticipate our main conclusions: 

• the individual h-index is very strongly correlated with the total number
of individual citations, as suggested by Hirsch (2005) and emphasized
by Nielsen (2008)

• the ratio between (group averaged) total citations and academic age
shows three markedly different behaviours. The ratio grows (roughly
linearly) with time during the first ten years; it stabilizes at a
relatively constant (plateau) value for at least fifteen years; it then
decreases to reach a second constant, but lower, value, for longer
academic ages $A_A$

• a similar pattern (which we believe to follow from the observed time
dependence of the above ratio) is shown by the ratio between the (group
averaged) h-index and the square root of the academic age $A_A$

• the ratio between the individual h-index and the square root of the
academic age ($h/\sqrt{A_A}$) appears to be strongly correlated with the
contemporary h-index

• to assess scientists who have been active for more than ten years, it
appears reasonable to compare the index $h/\sqrt{A_A}$ to the observed
plateau values

2. The correlation between the total number of citations and the
h-index

As first explained by Hirsch, the relationship between the total number of
citations $N_c$ of individuals and their h-index is expected to take the
general form $N_c = a h^2$ with $3 < a < 5$, although there seems to be no
obvious theoretical reason why the parameter $a$ should have some special
and universal value.

Fig. 1 shows the relation between $h$ and $N_c$: a clear linear relation is
visible when $h$ is plotted against $\sqrt{N_c}$, confirming the empirical
suggestion by Hirsch. The correlation between the two variables in the plot
is 0.97. The straight line is a best fit, using a relation of the form
$h = \alpha\sqrt{N_c}$, and the resulting slope is $\alpha = 0.53$,
corresponding to a value $a = 1/\alpha^2 \approx 3.5$.
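
A one-parameter fit of this kind can be sketched as follows. The text does
not state which fitting procedure was used, so ordinary least squares through
the origin is an assumption here; the slope is called alpha to keep it
distinct from Hirsch's $a$, and the input data are synthetic.

```python
import math


def fit_slope_through_origin(total_citations, h_values):
    """Least-squares slope alpha for h = alpha * sqrt(N_c), with no intercept."""
    xs = [math.sqrt(n) for n in total_citations]
    numerator = sum(x * h for x, h in zip(xs, h_values))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def hirsch_a(alpha):
    """Convert the slope alpha into Hirsch's a, using N_c = a * h**2."""
    return 1.0 / alpha ** 2


# Synthetic data lying exactly on h = 0.53 * sqrt(N_c) (illustrative only).
n_c = [100, 400, 900, 1600]
h = [0.53 * math.sqrt(n) for n in n_c]

alpha = fit_slope_through_origin(n_c, h)
print(round(alpha, 2))           # 0.53
print(round(hirsch_a(0.53), 2))  # 3.56
```

Inverting $h = \alpha\sqrt{N_c}$ gives $N_c = h^2/\alpha^2$, so a slope of
0.53 corresponds to $a$ of about 3.56, in line with the value quoted above.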

We also examined more restricted communities (like theoreticians and
experimentalists, or senior and junior researchers), finding typically that
$a$ changes only very mildly among different communities. The resulting
values of $a$ are summarized in Table 1.

We notice that indeed most of the $a$ values are in the range 3 to 4, and
that the correlation coefficient is very close to one in most cases.
Furthermore, it is interesting to note that $a$ tends to change little
within each age category, possibly with the exception of the HEP
experimental research associates (who show an $a$ significantly larger than
that of the research associates of other fields): this is easily understood
when we recall that research in this field typically involves large
collaborations: hence research associates, even at relatively young academic
ages, possibly show bibliometric indicators typical of older academic staff
in the field.

Figure 1: The h-index vs the square root of the total number of citations
$N_c$, for each scientist considered in this work (dots). The straight line
is the relation $h = \alpha\sqrt{N_c}$, with $\alpha = 0.53$ obtained through
a best fit to the data.

In the spirit of our approach, aiming at defining some benchmarks and
thresholds rather than individual rankings, our first preliminary conclusion
is that the total number of citations $N_c$ is as good an indicator as the
h-index itself: this implies that $\sqrt{N_c}$ is a quite reasonable proxy
of the h-index (Nielsen, 2008).



3. Time evolution 

3.1. The time dependence of the total number of citations 

The individuals under consideration have academic ages $A_A$ ranging from
3 to 46 years, and the dimension of the corresponding age groups ranges from
4 to 63 units. We discarded a (small) number of cases corresponding to age
values outside the above mentioned range because of the scarce statistical
significance of the corresponding samples, and looked at the behaviour of an
indicator defined as the total number of citations divided by the academic
age ($N_c/A_A$) as a function of the academic age.

                     FullP     AssoP     Rese      Total     Sample  Plateau

Astro        a       3.23      3.28      3.64      3.33      71      4.4
             corr    [illeg.]  [illeg.]  [illeg.]  [illeg.]  32      2.9

HEP-exp      a       3.64      3.33      3.92      3.60      110     4.1
             corr    [illeg.]  [illeg.]  [illeg.]  [illeg.]  99      3.2

HEP-the      a       3.67      3.07      3.14      3.42      79      3.8
             corr    [illeg.]  [illeg.]  [illeg.]  [illeg.]  88      2.9

Matter-exp   a       3.49      3.42      3.26      3.43      158     3.7
             corr    [illeg.]  0.976     0.920     0.969     155     3.0

Matter-the   a       3.97      3.47      3.42      3.73      64      3.9
             corr    [illeg.]  [illeg.]  [illeg.]  [illeg.]  47      3.0

AppPhys      a       3.45      3.09      2.83      3.19      72      3.1
             corr    [illeg.]  [illeg.]  [illeg.]  [illeg.]  52      2.3

All          a                                     [illeg.]  554     3.8
             corr                                  0.97      473     2.9

Table 1: Summary of the parameter a, and the corresponding measure of
correlation corr, evaluated considering different Physics research fields
(Astro: Astronomy and Astrophysics; HEP: High Energy Physics; Matter:
Condensed Matter, Atomic, Molecular and Optical Physics; AppPhys: Applied
Physics; exp: Experimentalists; the: Theoreticians) and academic career
progress (FullP: Full professors; AssoP: Associate professors; Rese:
Research associates). The columns labeled Sample and Plateau are relative to
the time dependence of the h-index, and they will be discussed further down
in this paper.

The result is shown in Fig. 2. Despite some fluctuations mainly due to the
small population of some age groups, three distinct time ranges
characterized by different behaviours of the indicator are clearly visible:

• In the academic age range between 3 and 12 years the indicator grows
(roughly) linearly, starting from zero after a two-year time delay from
the first publication date. Notice that a linear growth in the indicator
would correspond to a quadratic growth in the total number of citations,
and this is consistent with a (plausible) pattern of a constant
publication rate and of a citation rate per publication staying constant
for some years. Saturation occurs when older publications cease to be
cited and the annual citation rate is kept constant only by the influx
of new publications.

• In the academic age range between 12 and 24 years the annual citation
rate (barring fluctuations) stays constant at a (weighted) average value
of approximately 58 citations per year.

• A rapid decline follows, and for academic ages above 30 years a new
approximate stabilization occurs, with (weighted) values oscillating
around 39 citations per year.

Figure 2: The average number of citations divided by the academic age
$N_c/A_A$ vs the academic age $A_A$ (solid broken line). The two dashed
lines mark the two constant annual citation rates discussed in the text.
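
The growth-then-saturation mechanism described in the first item of the list
above can be illustrated with a toy model. This is only a sketch: the yearly
publication batches, the fixed citation lifetime and all parameter values are
hypothetical, and no attempt is made to reproduce the data of Fig. 2.

```python
def total_citations(age, papers_per_year=1.0, cites_per_paper_year=1.0,
                    lifetime=10):
    """Total citations after `age` years in a toy model (hypothetical).

    One batch of papers appears at the start of every year; each paper
    collects `cites_per_paper_year` citations in each of its first
    `lifetime` years and none afterwards.
    """
    total = 0.0
    for pub_year in range(age):
        active_years = min(age - pub_year, lifetime)
        total += papers_per_year * cites_per_paper_year * active_years
    return total


# Early phase: N_c grows quadratically, so N_c / A_A grows linearly.
for age in (2, 4, 6, 8):
    print(age, total_citations(age) / age)

# Later phase: old papers stop being cited, so the yearly gain is constant.
for age in (20, 30, 40):
    print(age, total_citations(age + 1) - total_citations(age))
```

With the default parameters the ratio $N_c/A_A$ equals $(A_A + 1)/2$ while
$A_A$ is below the citation lifetime, and the yearly gain in $N_c$ settles
at 10 citations per year afterwards. In this simple model the ratio
$N_c/A_A$ itself approaches that constant only gradually.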

The decline in the annual citation rate might very well be explained by
scientific aging occurring for members of the community, who are typically
in their sixties, and by a possible influence of a general growth in the
number of citations observed in recent times, which tends to bias older
researchers towards lower citation rates. We note, however, that the sharp
decline and the subsequent lower level stabilization could be due to a
possible bias present in our sample: approximately thirty years ago, Italian
Universities underwent a massive permanent recruiting, and it is believed
that not all the people recruited in those times (and who are still present
in the system, and hence in our dataset) managed to keep productivity
standards typical of more selected groups.

3.2. The time dependence of the h-index 

In view of the results presented in the previous sections, it is natural to
explore the behaviour of a time-normalized h-index obtained by taking the
ratio between $h$ and the square root of the academic age $A_A$.

The result is summarized in Fig. 3. Pleasantly enough, fluctuations are
damped and the time pattern observed for the average number of citations
(Fig. 2) is even more evident. Following an initial growth, in the range
between 12 and 24 years of academic age we observe a plateau value
approximately equal to 3.8 (with a standard deviation approximately equal to
1.6), followed by a decline to a plateau value of 2.9 for academic ages
larger than 30 years. The time dependence of $h/\sqrt{A_A}$ is similar when
we consider restricted communities, with a linear initial growth, followed
by a first plateau for intermediate academic ages, and a decrease to a lower
plateau for longer academic ages: the observed plateaus are summarized in
Table 1 under the column "Plateau", with the larger value referring to the
former plateau, and the smaller value to the latter plateau; the column
"Sample" shows the number of physicists falling in each category. We
emphasize that the constant behaviour of the quantity $h/\sqrt{A_A}$ over
the large region of academic ages between 12 and 24 years suggests that
indeed its plateau value could be used as a quality benchmark.

Figure 3: The average h-index normalized to the square root of academic age
$h/\sqrt{A_A}$ vs the academic age $A_A$ (black squares with a broken line).
The two dashed lines mark the two constant values discussed in the text.
Circles show the values of the normalized h-index of each researcher in our
sample.
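
A rough consistency check, which is not part of the original analysis, links
these plateau values to the citation rates of Section 3.1. If $h$ is close to
$0.53\sqrt{N_c}$ and $N_c$ is close to $r A_A$ for a constant annual rate
$r$, then $h/\sqrt{A_A}$ should be close to $0.53\sqrt{r}$. Applying the
individual-level fit to group averages is an assumption of this sketch.

```python
import math

ALPHA = 0.53  # slope of h versus sqrt(N_c) quoted in Section 2


def expected_scaled_h(citations_per_year, alpha=ALPHA):
    """Estimate h / sqrt(A_A) from a constant annual citation rate."""
    return alpha * math.sqrt(citations_per_year)


print(round(expected_scaled_h(58), 2))  # 4.04
print(round(expected_scaled_h(39), 2))  # 3.31
```

The estimates, about 4.0 and 3.3, lie somewhat above the observed plateaus of
3.8 and 2.9. An overestimate is to be expected, since the square root of an
average is never smaller than the average of the square roots.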

4. The time scaled index $h/\sqrt{A_A}$

It is interesting to assess the statistical properties of the distribution
of the index $h/\sqrt{A_A}$. The main result is shown in Fig. 4: as
customary in the presence of discrete distributions characterized by some
fluctuations, we first studied the cumulative distribution of the quantity
$h/\sqrt{A_A}$ (shown as small circles in the figure), for individuals with
academic ages in the intermediate range. These data are very well described
by the Gompertz function $f(x) = \exp\left(-e^{-c(x-b)}\right)$ (Gompertz),
with $c$ and $b$ parameters quantifying the data. The fit to the data using
this function is shown by the solid line, where the parameters turn out to
be $c = 0.71$ and $b = 3.05$. The inset shows the derivative of the fitting
function, in other words the smoothed distribution of expected values for
the number of individuals having a given value of $h/\sqrt{A_A}$. We notice
that, given the skewness of the derivative, the value of $b$ (which yields
the position of the maximum of the derivative) is smaller than the computed
average value (see Table 1). On the other hand, the inverse of $c$ is a good
indicator of the width of the derivative, and it follows that
$1/c \approx 1.4$, which is pleasantly close to the standard deviation (1.6)
we computed directly from the data. Finally, Fig. 5 shows an enlargement of
the central region of Fig. 3: the average value of $h/\sqrt{A_A}$ and the
one standard deviation lines nicely interpolate the distribution of
$h/\sqrt{A_A}$, for the whole range of academic ages considered.

Figure 4: The cumulative distribution of average h-index normalized to the
square root of academic age $h/\sqrt{A_A}$, for individuals with academic
age between 12 and 24 years (symbols). The solid line is a best fit using a
Gompertz function (see text). The inset shows the distribution of the number
of individuals with a given $h/\sqrt{A_A}$, obtained taking the derivative
of the best fitting Gompertz function, scaled so that the area under the
curve equals the number of researchers in this $A_A$ range (554).

Figure 5: An enlargement of the central region ($A_A$ in the range from 12
to 24 years) of Fig. 3. The symbols show $h/\sqrt{A_A}$ for each individual;
the solid and dashed lines mark the average and the one standard deviation
values of $h/\sqrt{A_A}$ (3.8 ± 1.6).
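
The properties of the fitted curve quoted in this section can be checked
numerically. The sketch below assumes the Gompertz form
$f(x) = \exp(-e^{-c(x-b)})$ with $c = 0.71$ and $b = 3.05$; the mean and
standard deviation use the standard closed-form results for a distribution
with this cumulative function, which are supplied here and are not taken
from the paper.

```python
import math

C, B = 0.71, 3.05  # fitted parameters quoted in the text
EULER_GAMMA = 0.5772156649015329


def gompertz_cdf(x, c=C, b=B):
    """Cumulative distribution exp(-exp(-c (x - b)))."""
    return math.exp(-math.exp(-c * (x - b)))


def gompertz_pdf(x, c=C, b=B):
    """Derivative of the cumulative distribution; it peaks at x = b."""
    z = math.exp(-c * (x - b))
    return c * z * math.exp(-z)


mean = B + EULER_GAMMA / C              # closed-form mean
std = math.pi / (C * math.sqrt(6.0))    # closed-form standard deviation

print(round(mean, 2))   # 3.86
print(round(std, 2))    # 1.81
print(round(1 / C, 2))  # 1.41
```

The mean of the fitted curve, about 3.86, exceeds $b = 3.05$ because of the
skewness and is close to the average of 3.8 computed from the data, while
both $1/c$ and the closed-form standard deviation are of the same order as
the measured value of 1.6.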



5. The comparison with the contemporary h-index

It is also interesting to compare the index $h/\sqrt{A_A}$ to the
contemporary h-index (Sidiropoulos, Katsaros and Manolopoulos, 2006) ($h_c$
in the following), which has been introduced to assess individuals who have
been scientifically active over widely different ranges of time. We recall
that $h_c$ depends on the current year $y_n$ and it is evaluated by
renormalizing the number of citations $n_i$ of the paper $i$, published in
the year $y_i$, as $h_i = n_i \gamma (y_n - y_i + 1)^{-\delta}$, and using
the $h_i$ sequence to compute $h_c$, with the same algorithm used for the
h-index.
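
The procedure just described can be sketched in a few lines. The function
name, the input format and the example papers are hypothetical; the default
parameters are the values 4 and 1 adopted for the comparison in this section.

```python
def contemporary_h_index(papers, current_year, gamma=4.0, delta=1.0):
    """Contemporary h-index from (publication_year, citations) pairs.

    Each citation count is rescaled to gamma * n / (age ** delta), where
    age = current_year - publication_year + 1, and the usual h-index
    algorithm is then applied to the rescaled values.
    """
    scores = sorted(
        (gamma * cites / (current_year - year + 1) ** delta
         for year, cites in papers),
        reverse=True,
    )
    h_c = 0
    for rank, score in enumerate(scores, start=1):
        if score >= rank:
            h_c = rank
        else:
            break
    return h_c


# Hypothetical publication record (illustrative only).
papers = [(2012, 2), (2010, 6), (2002, 30), (1990, 40)]
print(contemporary_h_index(papers, current_year=2012))       # 4

# A single old paper with few citations no longer counts.
print(contemporary_h_index([(1972, 5)], current_year=2012))  # 0
```

In the second example the paper is 41 years old, so its five citations are
rescaled to about 0.49 and fall below the threshold of one, whereas the plain
h-index of the same record would be 1.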

To carry out the comparison, we took the widely used values $\gamma = 4$
and $\delta = 1$. First, we plotted in Fig. 6 the contemporary h-index
($h_c$) as a function of the academic age. Circles are the $h_c$ index for
each individual in our sample, and the black squares joined by a broken
solid line show the average $h_c$ for each $A_A$ class. It is very
interesting to notice that, after an initial region where the average $h_c$
grows linearly, for $A_A$ larger than 12 years the average $h_c$ remains
constant, up to the largest $A_A$ present in our sample.

The comparison between $h_c$ and $h/\sqrt{A_A}$ is summarized in Fig. 7,
where we plotted $h_c$ versus the index $h/\sqrt{A_A}$, for all individuals
in our sample. It is clear from the figure that the two indicators are
proportional to each other, and this is confirmed by a best fit using a
cubic polynomial, shown in the figure as a solid line, which appears
indistinguishable from a straight line. The conclusion is that, at least for
our sample, the index $h/\sqrt{A_A}$ appears to yield the same information
provided by the contemporary h-index, and hence the two indexes are
interchangeable (at least within a wide academic age range): we like to
remark, however, that the evaluation of the quantity $h/\sqrt{A_A}$ appears
to be easier than the evaluation of the contemporary h-index, and that the
contemporary h-index requires two arbitrary parameters ($\gamma$ and
$\delta$) which need to be introduced empirically. It will be matter for
further work to assess whether the proportionality between these indexes is
also observed when different values for the two parameters are taken.

Figure 6: The average $h_c$ vs the academic age $A_A$ (black squares with a
broken line). Circles show the values of the $h_c$ of each researcher in our
sample.

6. Conclusions 

We have produced evidence that the index $h/\sqrt{A_A}$, averaged over
sufficiently large groups, is a sensible proxy for the contemporary h-index,
and tends to stay constant in time in the interval between 12 and 24 years
of research activity, which is the typical range for researchers to apply
for permanent and/or higher positions. The plateau value of the index
$h/\sqrt{A_A}$ might therefore be used as a quality benchmark, even if its
eminently statistical origin does not make it proper to employ it for any
kind of ranking of individual researchers.

As for the numerical value of the plateau, one must not forget that our
analysis implied the aggregation of widely different typologies of
researchers, and therefore the numbers we obtained are weighted averages of
the values corresponding to each homogeneous subgroup of researchers. This
should not affect our general conclusions, since we have revealed a common
trend, and the lack of homogeneity could, at most, obscure specific trends
that are peculiar to a subgroup.

Figure 7: Comparison between the contemporary h-index $h_c$ and the index
$h/\sqrt{A_A}$ (symbols). The solid line is a best fit using a cubic
polynomial.